Setting Up KrigR

Installing KrigR

First of all, we need to install KrigR.

Until we have implemented our development ideas and goals, KrigR will not be submitted to CRAN to avoid the hassle of updating an already accepted package.

For the time being, KrigR is only available through the associated GitHub repository.

Here, we use the devtools package within R to get access to the install_github() function. For this to run, we need to tell R not to stop the installation from GitHub as soon as a warning is produced. Without this, the installation would abort as early as one of the KrigR dependencies hits a warning as benign as "Package XYZ was built under R Version X.X.X", which can usually be ignored safely.

Subsequently, KrigR can be installed and loaded as follows:

Sys.setenv(R_REMOTES_NO_ERRORS_FROM_WARNINGS="true")
devtools::install_github("https://github.com/ErikKusch/KrigR")
library(KrigR)

CDS API Access

Before you can access Climate Data Store (CDS) products through KrigR, you need to open up an account at the CDS and create an API key. Once you have done so, we recommend you register the user ID and API Key as characters as seen below (this will match the naming scheme in our workshop material):

API_User <- "12345"
API_Key <- "YourApiKeyGoesHereAsACharacterString"

Don’t forget to sign the terms and conditions of the CDS!

You are now ready for KrigR.

Preparing the Workshop

R Packages for the Workshop

For the sake of this series of tutorials, we need some extra packages for visualisations. To load them, we use a custom function (install.load.package(), see below). This function checks whether a package is already installed, installs it if necessary, and then loads the package. To carry this operation out for several packages, we simply apply it to a vector of package names using sapply():

install.load.package <- function(x){
  if (!require(x, character.only = TRUE))
    install.packages(x, repos='http://cran.us.r-project.org')
  require(x, character.only = TRUE)
}
package_vec <- c("tidyr", # for turning rasters into ggplot-dataframes
                 "ggplot2", # for plotting
                 "viridis", # colour palettes
                 "cowplot", # gridding multiple plots
                 "ggmap", # obtaining satellite maps
                 "gimms", # to get some pre-existing data to match in our downscaling
                 "rnaturalearth", # for shapefiles
                 "rnaturalearthdata", # for high-resolution shapefiles
                 "mapview" # for generating mapview outputs
                 )
sapply(package_vec, install.load.package)
##             tidyr           ggplot2           viridis           cowplot             ggmap 
##              TRUE              TRUE              TRUE              TRUE              TRUE 
##             gimms     rnaturalearth rnaturalearthdata           mapview 
##              TRUE              TRUE              TRUE              TRUE

Setting up Directories

The workshop is designed to run completely from scratch. For this to work in a structured way, we create a folder/directory structure so that our files are neatly compartmentalised on our hard drives. We create the following directories:

  • A Data directory for all of our data downloads
  • A Covariate directory for all of our covariate data
  • An Exports directory for all of our Kriging outputs
Dir.Base <- getwd() # identifying the current directory
Dir.Data <- file.path(Dir.Base, "Data") # folder path for data
Dir.Covariates <- file.path(Dir.Base, "Covariates") # folder path for covariates
Dir.Exports <- file.path(Dir.Base, "Exports") # folder path for exports
## create directories, if they don't exist yet
Dirs <- sapply(c(Dir.Data, Dir.Covariates, Dir.Exports), 
               function(x) if(!dir.exists(x)) dir.create(x))

Visualisation Functions

In order to easily visualise our Kriging procedure including (1) inputs, (2) covariates, and (3) outputs without repeating too much of the same code, we have prepared some plotting functions (FUN_Plotting.R).

With the FUN_Plotting.R file placed in the project directory of your workshop material (i.e., the directory returned by Dir.Base), running the following will register the three plotting functions in your R environment.

source("FUN_Plotting.R")

The plotting functions you have just loaded are called:

  • Plot_Raw() - we will use this function to visualise data downloaded with KrigR
  • Plot_Covs() - this function will help us visualise the covariates we use for statistical interpolation
  • Plot_Krigs() - kriged products and their associated uncertainty will be visualised using this function

Don’t worry about understanding how these functions work off the bat here. Kriging and the package KrigR are what we want to demonstrate here - not visualisation strategies.

Locations of Interest

Our Workshop Target Region

To keep this workshop material concise and to make sure you don’t need access to a server or cluster throughout the following demonstrations of KrigR, we will specify a set of locations in which we are interested.

The locations we focus on for this workshop are situated throughout eastern Germany and the north-western parts of the Czech Republic. Why do we focus on this particular part of the Earth? There are three reasons:

  1. Topographical Heterogeneity - the area we select here contains large swaths of flat lowlands as well as some mountain ridges. This will make for visually pleasing plots and highlight the capability of kriging.
  2. Geographic Scale - the area we are selecting here hits a certain sweet-spot for our purposes as its size makes it so that all KrigR functions run to completion in a relatively short time.
  3. Familiarity - I was born and grew up in this region and have fond memories of the place. Please excuse my indulging in a bit of nostalgia.

Change the locations of interest at your own risk.

Using a different set of locations than the ones we specify here will change computational load and time as well as disk space required when working through the workshop material.

KrigR will be able to get you the data you want for the locations you desire, but computational requirements will vary.

Spatial Preferences in KrigR

KrigR is capable of learning about your spatial preferences in three ways:
1. As an extent input (a rectangular box).
2. As a SpatialPolygons input (a polygon or set of polygons).
3. As a set of locations stored in a data.frame.

To demonstrate the range of specifications permitted in KrigR, we make use of all three. Masking out unnecessary areas from our analyses speeds up kriging tremendously, which is why we strongly suggest you make use of SpatialPolygons or data.frames whenever possible.

Area of Interest (extent)

The simplest way to run the functions of the KrigR package is by supplying a rectangular bounding box (i.e., an extent) to specify your study region(s). We simply specify the longitude and latitude ranges and store the object as an extent:

Extent_ext <- extent(c(9.87, 15.03, 49.89, 53.06))

Shape of Interest (SpatialPolygons)

To define SpatialPolygons for our purposes, I make use of the NaturalEarthData. Here, I select a set of polygons corresponding to some states in Germany and the Czech Republic:

Shape_shp <- ne_states(country = c("Germany", "Czech Republic"))
Shape_shp <- Shape_shp[Shape_shp$name_en %in% c("Saxony", "Saxony-Anhalt", "Thuringia", 
                                                "Ústí nad Labem Region", "Karlovy Vary Region"), ]

The above requires the rnaturalearthhires package, which can give some users trouble.

Here’s a workaround if rnaturalearthhires does not work for you:

download.file("https://www.naturalearthdata.com/http//www.naturalearthdata.com/download/10m/cultural/ne_10m_admin_1_states_provinces.zip",
              destfile = "highres.zip")
unzip("highres.zip")
Shape_shp <- rgdal::readOGR("ne_10m_admin_1_states_provinces.shp") # requires the rgdal package
Shape_shp <- Shape_shp[Shape_shp$name_en %in% c("Saxony", "Saxony-Anhalt", "Thuringia",
                                                "Ústí nad Labem", "Karlovy Vary"), ]

Points of Interest (data.frame)

Finally, to represent specific points of interest, I have prepared a small data set of mountains for each state in the shapefile above. This file is called Mountains_df.RData.

The Mountains_df.RData file is found in this .zip file.

Simply place this file into your data directory and continue the workshop.

Let’s load this data set and quickly visualise it:

load(file.path(Dir.Data, "Mountains_df.RData")) # load a data.frame called Mountains_df
Mountains_df
##          Mountain      Lon      Lat
## 1     Fichtelberg 12.95472 50.42861
## 2         Brocken 10.61722 51.80056
## 3 Großer Beerberg 10.74611 50.65944
## 4        Meluzína 13.00778 50.39028
## 5       Milešovka 13.93153 50.55523

We now have all of our objects for spatial preferences ready for the workshop.

Visualising our Study Setting

To finish our preparations for this workshop, let’s visualise the different locations of interest:

## Establish rectangular bounding box from extent
bbox <- as.numeric(as(Extent_ext, 'SpatialPolygons')@bbox)
names(bbox) <- c("left", "bottom", "right", "top")

## Make locations of mountains into SpatialPoints
Mountains_sp <- Mountains_df
coordinates(Mountains_sp) <- ~Lon+Lat

## download a map of the area specified by the extent
back_gg <- get_map(bbox, maptype = 'terrain')

## combine locations of interest into one plot
ggmap(back_gg, extent = "device") + # plot the extent area
  ## display the SpatialPolygons area
  geom_polygon(aes(x = long, y = lat, group = id), data = fortify(Shape_shp),
               colour = 'black', size = 1, fill = 'black', alpha = .5) + 
  ## add the data.frame data
  geom_point(aes(x = Lon, y = Lat), data = data.frame(Mountains_sp), 
             colour = "red", size = 4, pch = 13) + 
  ## some style additions
  theme_bw() + labs(x= "Longitude [°]", y = "Latitude  [°]") + 
  theme(plot.margin=unit(c(0, 1, 0, 1),"lines"))

In the above figure, the map area designates the extent specification, the grey overlay displays the SpatialPolygons preference, and points of interest (from our data.frame input) are highlighted with red plotting symbols.

We are now ready to start the KrigR portion of the workshop!

The KrigR Workflow

Downloads with KrigR

To streamline this workshop material, I will focus on just three short time series of data with different spatial limitations. I visualise them all side-by-side further down.

Let’s start with a very basic call to download_ERA().

For this part of the workshop, we download air temperature for my birth month (January 1995) using the extent of our target region.

If you want to know about the defaults for any argument in download_ERA(), simply run ?download_ERA. Doing so should make it obvious why we specify the function as we do below.

Notice that the downloading of ERA-family reanalysis data may take a short while to start as the download request gets queued with the CDS of the ECMWF before it is executed.

Extent_Raw <- download_ERA(
  Variable = "2m_temperature",
  DataSet = "era5-land",
  DateStart = "1995-01-01",
  DateStop = "1995-01-04",
  TResolution = "day",
  TStep = 1,
  Extent = Extent_ext,
  Dir = Dir.Data,
  FileName = "ExtentRaw",
  API_User = API_User,
  API_Key = API_Key
)
## download_ERA() is starting. Depending on your specifications, this can take a significant time.
## User 39340 for cds service added successfully in keychain
## Staging 1 download(s).
## 0001_ExtentRaw.nc download queried
## Requesting data to the cds service with username 39340
## - staging data transfer at url endpoint or request id:
##   3e205e42-5748-4987-a2eb-fd408b19cba3
## - timeout set to 10.0 hours
## - polling server for a data transfer ...
##   |================================================================================| 100%
## - moved temporary file to -> C:/Users/erike/Documents/Work/Z - Conferences-Workshops_Presentations/2022_06_15-17 - [NordOikos]/KrigR Workshop/Data/0001_ExtentRaw.nc
## - request purged from queue!
## Checking for known data issues.
## Loading downloaded data for masking and aggregation.
## Aggregating to temporal resolution of choice
##   |================================================================================| 100%

As you can see the download_ERA() function updates you on what it is currently working on at each major step. I implemented this to make sure people don’t get too anxious staring at an empty console in R. If this feature is not appealing to you, you can turn this progress tracking off by setting verbose = FALSE in the function call to download_ERA().

For the rest of this workshop, I suppress messages from download_ERA() in this document; when you execute the code yourself, you will still get progress tracking.

I will make exceptions to this rule when there are special things I want to demonstrate.

Now, let’s look at the raster that was produced:

Extent_Raw
## class      : RasterStack 
## dimensions : 34, 54, 1836, 4  (nrow, ncol, ncell, nlayers)
## resolution : 0.09999999, 0.09999998  (x, y)
## extent     : 9.72, 15.12, 49.74, 53.14  (xmin, xmax, ymin, ymax)
## crs        : +proj=longlat +datum=WGS84 +no_defs 
## names      :       X1,       X2,       X3,       X4 
## min values : 270.8622, 268.0212, 266.9560, 262.7403 
## max values : 275.1170, 273.2443, 272.5641, 270.0795

If you are keen-eyed, you will notice that the extent on this object does not align with the extent we supplied with Extent_ext. The reason? To download the data, we need to snap to the nearest full cell in the data set from which we query our downloads. KrigR always ends up widening the extent to ensure all the data you desire will be downloaded.
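The widening can be sketched in a few lines of base R. Note that this is a conceptual illustration of snapping an extent outward to a 0.1° grid, not the exact code KrigR runs internally (the real ERA5-Land grid is slightly offset, so the values above differ a little from this sketch):

```r
library(raster)

## conceptual sketch: snap an extent outward to the nearest full 0.1° cells
## (snap_out is a hypothetical helper, not a KrigR function)
snap_out <- function(ext, res = 0.1){
  extent(floor(ext@xmin / res) * res,   # widen westward
         ceiling(ext@xmax / res) * res, # widen eastward
         floor(ext@ymin / res) * res,   # widen southward
         ceiling(ext@ymax / res) * res) # widen northward
}
snap_out(extent(c(9.87, 15.03, 49.89, 53.06))) # always at least as wide as the input
```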

Finally, let’s visualise our downloaded data with one of our user-defined plotting functions:

Plot_Raw(Extent_Raw, Dates = c("01-1995", "02-1995", "03-1995", "04-1995"))

That is all there is to downloading ERA5(-Land) data with KrigR. You can already see how, even at the relatively coarse resolution of ERA5-Land, the mountain ridges along the German-Czech border are showing up. This pattern will become much clearer once we downscale our data.

download_ERA() provides you with a lot more functionality than just access to the ERA5(-Land) data sets.

With download_ERA(), you can also carry out processing of the downloaded data. Data processing with download_ERA() includes:
1. Spatial Limitation to cut down on the data that is stored on your end.
2. Temporal Aggregation to establish data at the temporal resolution you desire.

Spatial Limitation

Let’s start with spatial limitation. As discussed previously, download_ERA() can handle a variety of inputs describing spatial preferences.

KrigR is capable of learning about your spatial preferences in three ways:
1. As an extent input (a rectangular box).
2. As a SpatialPolygons input (a polygon or set of polygons).
3. As a set of locations stored in a data.frame.
These spatial preferences are registered in KrigR functions using the Extent argument.

You might now ask yourself: How does KrigR achieve spatial limitation of the data? Couldn’t we just simply download only the data we are interested in?

The ECMWF CDS gives us tremendous capability for retrieving only the data we want. However, the CDS only recognises rectangular boxes (i.e., extents) for spatial limitation. Consequently, we always have to download data corresponding to a rectangular box in space. When informing KrigR of your spatial preferences using a data.frame or SpatialPolygons, download_ERA() automatically (1) identifies the smallest extent required by your input, (2) downloads data corresponding to this extent, and (3) masks out any data not queried by you.
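Conceptually, the limitation step is equivalent to the following sketch, here applied to an already-downloaded raster (download_ERA() performs the actual CDS download for step 2; the object names are only illustrative):

```r
library(raster)

## (1) smallest rectangular extent required by the polygons
DL_ext <- extent(Shape_shp)
## (2) data corresponding to that extent (here cropped from an existing download)
Box_ras <- crop(Extent_Raw, DL_ext)
## (3) mask out any cells not queried by you (they become NA)
Masked_ras <- mask(Box_ras, Shape_shp)
```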

Using KrigR’s spatial limitation features ensures faster computation and smaller file sizes (depending on file type).

In the following, I demonstrate how to use the Extent argument in download_ERA().

Shape (SpatialPolygons)

Let me show you how SpatialPolygons show up in our data with download_ERA(). First, we query our download as follows:

SpatialPolygonsRaw <- download_ERA(
  Variable = "2m_temperature",
  DataSet = "era5-land",
  DateStart = "1995-01-03",
  DateStop = "1995-01-03",
  TResolution = "day",
  TStep = 1,
  Extent = Shape_shp,
  Dir = Dir.Data,
  FileName = "SpatialPolygonsRaw",
  API_User = API_User,
  API_Key = API_Key
)
Plot_Raw(SpatialPolygonsRaw, 
         Shp = Shape_shp,
         Dates = c("01-1995", "02-1995", "03-1995", "04-1995")) 

You will find that the data retained with the spatial limitation in download_ERA() contains all raster cells of which even a fraction falls within the bounds of the SpatialPolygons you supplied. This is different from standard raster masking through which only cells whose centroids fall within the SpatialPolygons are retained.

Raster masking in KrigR always ensures that the entire area of your spatial preferences is retained.

Points (data.frame)

Now we move on to point-locations. Oftentimes, we are researching very specific sets of coordinates, rather than entire regions. download_ERA() is capable of limiting data to only small areas (of a size of your choosing) around your point-locations. For our purposes here, we make use of a set of mountain-top coordinates throughout our study region.

This time around, we need to tell download_ERA() about not just the Extent, but also specify how much of a buffer (Buffer, in °) to retain data for around each individual (ID) location.

The data.frame input to the Extent argument must contain a column called Lat and a column called Lon. In addition, one must also specify:
1. A Buffer (in °) to be drawn around each location.
2. The name of the ID column in your data.frame which indexes each individual location.

Our development goals include support for a broader range of point-location specifications.
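As a rough sketch, the rectangular download box implied by a data.frame of locations and a Buffer can be computed like this (assuming the Mountains_df object from above; download_ERA() additionally masks the data down to the buffered areas around each point):

```r
library(raster)

Buffer <- 0.5 # buffer in ° around each location
extent(min(Mountains_df$Lon) - Buffer, max(Mountains_df$Lon) + Buffer,
       min(Mountains_df$Lat) - Buffer, max(Mountains_df$Lat) + Buffer)
```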

Let’s stage such a download:

Points_Raw <- download_ERA(
  Variable = "2m_temperature",
  DataSet = "era5-land",
  DateStart = "1995-01-01",
  DateStop = "1995-01-04",
  TResolution = "day",
  TStep = 1,
  Extent = Mountains_df,
  Buffer = 0.5,
  ID = "Mountain",
  Dir = Dir.Data,
  FileName = "PointsRaw",
  API_User = API_User,
  API_Key = API_Key
)
Plot_Raw(Points_Raw, Dates = c("01-1995", "02-1995", 
                              "03-1995", "04-1995")) + 
  geom_point(aes(x = Lon, y = Lat), data = data.frame(Mountains_sp), 
             colour = "green", size = 10, pch = 14)

Above you can see how the mountain tops we are interested in lie exactly at the centre of the retained data. As we will see later, such spatial limitation greatly reduces computation cost of statistical downscaling procedures.

Dynamical Data Uncertainty

With climate reanalyses, you also gain access to uncertainty flags of the data stored in the reanalysis product. For the ERA5-family of products, this uncertainty can be obtained by assessing the standard deviation of the 10 ensemble members which make up the underlying ERA5 model exercise.
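In essence, this boils down to a per-cell standard deviation across the ensemble members. A minimal sketch, assuming Ens_stack is a hypothetical 10-layer RasterStack holding the ensemble members for a single time step (not an object created above):

```r
library(raster)

## per-cell spread across the 10 ensemble members of one time step
Ens_sd <- calc(Ens_stack, fun = sd)
```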

For an aggregate understanding of data uncertainty, we also obtain dynamical uncertainty for our target region and time frame. For simplicity, we do so only for the SpatialPolygons specification.

With download_ERA() you can obtain this information as follows:

SpatialPolygonsEns <- download_ERA(
    Variable = "2m_temperature",
    DataSet = "era5",
    Type = "ensemble_members",
    DateStart = "1995-01-01",
    DateStop = "1995-01-04",
    TResolution = "day",
    TStep = 1,
    FUN = sd,
    Extent = Shape_shp,
    Dir = Dir.Data,
    FileName = "SpatialPolygonsEns",
    API_User = API_User,
    API_Key = API_Key
  )
Plot_Raw(SpatialPolygonsEns, Dates = c("01-1995", "02-1995", 
                                          "03-1995", "04-1995"),
         Shp = Shape_shp, COL = rev(viridis(100)))

As you can see here, there is substantial disagreement between the ensemble members of daily average temperatures across our study region. This uncertainty among ensemble members is greatest at high temporal resolution and becomes negligible at coarse temporal resolution. We document this phenomenon in this publication (Figure 1).

We will see how these uncertainties stack up against other sources of uncertainty when we arrive at aggregate uncertainty of our final product.

Statistical Downscaling with KrigR

Statistical downscaling with KrigR is handled via the krigR() function and requires a set of spatial covariates.

For an introduction to the statistical downscaling process, I will first walk you through the SpatialPolygons spatial preference.

Covariates

First, we use the download_DEM() function which comes with KrigR to obtain elevation data as our covariate of choice. This produces two rasters:
1. A raster of training resolution which matches the input data in all attributes except for the data in each cell.
2. A raster of target resolution which matches the input data as closely as possible in all attributes except for the resolution (which is specified by the user).

Both of these products are bundled into a list where the first element corresponds to the training resolution and the second element contains the target resolution covariate data. Here, we specify a target resolution of .02.

This is how we specify download_DEM() to prepare DEM covariates for us:

Covs_ls <- download_DEM(Train_ras = SpatialPolygonsRaw, # the data we want to downscale
                        Target_res = .02, # the resolution we want to downscale to
                        Shape = Shape_shp, # extra spatial preferences
                        Dir = Dir.Covariates # where to store the covariate files
                        )

For now, let’s simply inspect our list of covariate rasters:

Covs_ls
## [[1]]
## class      : RasterLayer 
## dimensions : 34, 54, 1836  (nrow, ncol, ncell)
## resolution : 0.1000189, 0.09999998  (x, y)
## extent     : 9.726991, 15.12801, 49.75, 53.15  (xmin, xmax, ymin, ymax)
## crs        : +proj=longlat +datum=WGS84 +no_defs 
## source     : memory
## names      : DEM 
## values     : 20.11554, 861.7248  (min, max)
## 
## 
## [[2]]
## class      : RasterLayer 
## dimensions : 204, 324, 66096  (nrow, ncol, ncell)
## resolution : 0.01666667, 0.01666667  (x, y)
## extent     : 9.72486, 15.12486, 49.74986, 53.14986  (xmin, xmax, ymin, ymax)
## crs        : +proj=longlat +datum=WGS84 +no_defs 
## source     : memory
## names      : DEM 
## values     : 15.75, 1173  (min, max)

You will find that the target-resolution covariate data comes at a resolution of 0.01666667 instead of the 0.02 we specified. This happens because download_DEM() relies on raster::aggregate() to coarsen the high-resolution covariate data to your desired target resolution, and aggregation can only produce resolutions that are integer multiples of the base resolution of the GMTED 2010 DEM we use as our default covariate. This only applies when the Target_res argument is specified as a number.

Specifying the Target_res argument as a number therefore yields the best approximation of the desired resolution that raster::aggregate() can achieve within download_DEM(). If you need an exact resolution to match pre-existing data, please refer to the third-party section of this workshop material.
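This snapping to multiples can be checked with plain arithmetic. The sketch below assumes the GMTED 2010 base resolution of 30 arc-seconds (1/120 degrees) and mirrors how an integer aggregation factor would be derived; whether download_DEM() rounds exactly like this internally is an assumption:

```r
# GMTED 2010 base resolution: 30 arc-seconds, i.e. 1/120 degrees (assumption)
base_res <- 1 / 120
target_res <- 0.02  # the resolution we asked for

# raster::aggregate() only takes integer factors, so the requested
# resolution gets snapped to the nearest multiple of the base resolution
fact <- round(target_res / base_res)  # 2.4 rounds to 2
achieved <- base_res * fact

fact      # 2
achieved  # 0.01666667 - matching the resolution we actually received
```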

Before moving on, let’s visualise the covariate data:

Plot_Covs(Covs_ls, Shape_shp)

Kriging

Kriging can be a very computationally expensive exercise.

The expense of kriging is largely determined by three factors:
1. Change in spatial resolution.
2. Number of cells containing data; i.e. Spatial Limitation.
3. Localisation of Kriging; i.e. Localisation of Results.

We explore two of these further down in this workshop material. For more information, please consult this publication (Figure 4).

Finally, we are ready to interpolate our input data given our covariates with the krigR() function:

SpatialPolygonsKrig <- krigR(Data = SpatialPolygonsRaw, # data we want to krig as a raster object
      Covariates_coarse = Covs_ls[[1]], # training covariate as a raster object
      Covariates_fine = Covs_ls[[2]], # target covariate as a raster object
      Keep_Temporary = FALSE, # we don't want to retain the individually kriged layers on our hard-drive
      Cores = 1, # we want to krig on just one core
      FileName = "SpatialPolygonsKrig", # the file name for our full kriging output
      Dir = Dir.Exports # which directory to save our final input in
      )
## Commencing Kriging
## Kriging of remaining 3 data layers should finish around: 2022-06-07 12:40:25
## 
  |                                                                                      
  |                                                                                |   0%
  |                                                                                      
  |====================                                                            |  25%
  |                                                                                      
  |========================================                                        |  50%
  |                                                                                      
  |============================================================                    |  75%
  |                                                                                      
  |================================================================================| 100%

Just like with the download_ERA() function, krigR() updates you on what it is currently working on. Again, I implemented this to make sure people don’t get too anxious staring at an empty console in R. If this feature is not appealing to you, you can turn this progress tracking off by setting verbose = FALSE in the function call to krigR().

For the rest of this workshop, I suppress these messages in the rendered output by other means, so that when you execute the code yourself, you still get progress tracking.

There we go. The krigR() function returns a list containing the downscaled data as its first element and the kriging standard errors as its second. Let’s look at that:

SpatialPolygonsKrig[-3] # we talk later about the third element separately
## $Kriging_Output
## class      : RasterBrick 
## dimensions : 191, 309, 59019, 4  (nrow, ncol, ncell, nlayers)
## resolution : 0.01666667, 0.01666667  (x, y)
## extent     : 9.87486, 15.02486, 49.88319, 53.06653  (xmin, xmax, ymin, ymax)
## crs        : +proj=longlat +datum=WGS84 +no_defs 
## source     : memory
## names      : var1.pred.1, var1.pred.2, var1.pred.3, var1.pred.4 
## min values :    269.5768,    266.5758,    265.7707,    261.4294 
## max values :    275.0120,    273.3299,    272.1343,    270.0602 
## 
## 
## $Kriging_SE
## class      : RasterBrick 
## dimensions : 191, 309, 59019, 4  (nrow, ncol, ncell, nlayers)
## resolution : 0.01666667, 0.01666667  (x, y)
## extent     : 9.87486, 15.02486, 49.88319, 53.06653  (xmin, xmax, ymin, ymax)
## crs        : +proj=longlat +datum=WGS84 +no_defs 
## source     : memory
## names      : var1.stdev.1, var1.stdev.2, var1.stdev.3, var1.stdev.4 
## min values :    0.1251704,    0.1585938,    0.1427118,    0.1373274 
## max values :    0.1435884,    0.1745415,    0.1761995,    0.2816519

All the data has been downscaled and we do have uncertainties recorded for all of our outputs. Let’s visualise the data:

Plot_Krigs(SpatialPolygonsKrig, 
           Shp = Shape_shp,
           Dates = c("01-1995", "02-1995", "03-1995", "04-1995")
           )

As you can see, the elevation patterns show up clearly in our kriged air temperature output. Furthermore, you can see that our certainty of Kriging predictions drops on 04/01/1995 in comparison to the preceding days. However, do keep in mind that a maximum standard error of 0.144, 0.175, 0.176, and 0.282 (for each layer of our output, respectively) against a total data range of 5.435, 6.754, 6.364, and 8.631 (again, for each layer respectively) is evidence of a downscaling result we can be confident in. We also demonstrated the reliability of kriging in this publication (Figure 3).

Finally, this SpatialPolygons-informed downscaling took roughly 4 minutes on my machine (this may vary drastically on other devices).

Aggregate Uncertainty

Every climate data product is subject to a degree of uncertainty. Unfortunately, almost none of the established climate data products communicate their associated uncertainties. This leads to a dangerous overestimation of data credibility.

With the KrigR workflow, it is trivial to obtain uncertainty flags for all of your data - no matter the spatial or temporal resolution.

To understand the full certainty of our data obtained via the KrigR workflow, we should combine dynamical uncertainty with the statistical uncertainty we obtained from the krigR() function call above.

To do so, we require two data sets:
- SpatialPolygonsKrig - created above; containing statistical uncertainty in the second list position
- SpatialPolygonsEns - created here; containing dynamical uncertainty

First, we assign them to objects with shorter names:

DynUnc <- SpatialPolygonsEns
KrigUnc <- SpatialPolygonsKrig[[2]]

Next, we need to align the rasters of statistical uncertainty (resolution: 0.01666667) and dynamical uncertainty (resolution: 0.5). As you can see, these are of differing resolutions and so cannot simply be combined using raster math. Instead, we first disaggregate the coarser-resolution raster (DynUnc); disaggregation does not attempt any interpolation, thus preserving the data while representing it with smaller cells. To fix any remaining alignment issues, we allow for some resampling between the two rasters:

EnsDisagg <- disaggregate(DynUnc, fact=res(DynUnc)[1]/res(KrigUnc)[1])
DynUnc <- resample(EnsDisagg, KrigUnc)

Finally, we combine the two uncertainty data products to form an aggregate uncertainty product:

FullUnc <- DynUnc + KrigUnc

Now, we are ready to plot our aggregate uncertainty:

Plot_Raw(FullUnc, 
         Shp = Shape_shp, 
         Dates = c("01-1995", "02-1995", "03-1995", "04-1995"), 
         COL = rev(viridis(100)))

As you can see, at short time-scales dynamic uncertainty eclipses statistical uncertainty. However, this phenomenon reverses at longer time-scales as shown in this publication (Figure 1).

Bioclimatic Variables with KrigR

To obtain bioclimatic data with KrigR we want to use the BioClim() function.

Bioclimatic variables are often treated as very robust metrics - I do not believe they are. KrigR gives you access to a variety of specifications for your bioclimatic data needs.

Let’s start with the most basic of bioclimatic data products. So what are the specifications? Well, we:

  1. Query data for the period between 2010 (Y_start) and 2020 (Y_end, including 2020).
  2. Obtain data from the era5-land (DataSet) catalogue.
  3. Approximate water availability through precipitation (Water_Var), in keeping with typical practice.
  4. Calculate extreme metrics for temperature minimum and maximum from daily (T_res) aggregates of the underlying hourly temperature data.

You will see the function call to BioClim() wrapped in an if statement which checks whether the output is already present. BioClim compilation can take significant time and I do this here to avoid recompilation whenever I change the text of this blog post.

Setting the argument Keep_Monthly = TRUE will prompt the function to retain monthly aggregates of temperature and water availability alongside the final output. When BioClim() recognises that any of the underlying data is already present, it will skip the steps necessary to create this data.

if(!file.exists(file.path(Dir.Data, "Present_BC.nc"))){
  BC2010_ras <- BioClim(
      Water_Var = "total_precipitation",
      Y_start = 2010,
      Y_end = 2020,
      DataSet = "era5-land",
      T_res = "day",
      Extent = Shape_shp,
      Dir = Dir.Data,
      Keep_Monthly = FALSE,
      FileName = "Present_BC",
      API_User = API_User,
      API_Key = API_Key,
      Cores = numberOfCores,
      TimeOut = 60^2*48,
      SingularDL = TRUE,
      verbose = TRUE,
      Keep_Raw = FALSE,
      TryDown = 5
    )
}else{
  BC2010_ras <- stack(file.path(Dir.Data, "Present_BC.nc"))
}

Now let’s plot our results. Note that temperature is recorded in Kelvin and precipitation in metres of water-equivalent depth. To do so, we use one of our user-defined plotting functions:

Plot_BC(BC2010_ras, Shp = Shape_shp)

There’s not much to comment on in the output above, as it should look familiar to most macroecologists.
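If you would rather inspect the temperature layers in degrees Celsius, the Kelvin offset can simply be subtracted; applied to a raster object (e.g. BC2010_ras - 273.15), the shift happens cell-wise. A minimal numeric sketch:

```r
# Kelvin to Celsius is a constant offset; on a raster object this is
# simply BC2010_ras - 273.15, applied to every cell
kelvin_to_celsius <- function(K) K - 273.15

kelvin_to_celsius(275.00)  # 1.85 degrees Celsius
```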

Third-Party Data and KrigR

Matching Third-Party Data

I expect that you won’t want to downscale to specific resolutions most of the time, but rather match an already existing spatial data set in terms of spatial resolution and extent. Again, the KrigR package has you covered!

Usually, you probably want to downscale data to match a certain pre-existing data set rather than a certain resolution.

Here, we illustrate this with an NDVI-based example. The NDVI is a satellite-derived vegetation index which tells us how green the Earth is. It comes in bi-weekly intervals and at a spatial resolution of .08333 (roughly 9km). Here, we download all NDVI data for the year 2015 and then create the annual mean. This time, we do so for all of Germany because of its size and topographical variety.

Third-Party Data

Shape_shp <- ne_countries(country = "Germany")
## downloading gimms data
gimms_files <- downloadGimms(x = as.Date("2015-01-01"), # download from January 2015
                             y = as.Date("2015-12-31"), # download to December 2015
                             dsn = Dir.Data, # save downloads in data folder
                             quiet = FALSE # show download progress
                             )
## processing gimms data
gimms_raster <- rasterizeGimms(x = gimms_files, # the data we rasterize
                               remove_header = TRUE # we don't need the header of the data
                               )
indices <- monthlyIndices(gimms_files) # generate month indices from the data
gimms_raster_mvc <- monthlyComposite(gimms_raster, # the data
                                     indices = indices # the indices
                                     )
Negatives <- which(values(gimms_raster_mvc) < 0) # identify all negative values
values(gimms_raster_mvc)[Negatives] <- 0 # set threshold for barren land (NDVI<0)
gimms_raster_mvc <- crop(gimms_raster_mvc, extent(Shape_shp)) # crop to extent
gimms_mask <- KrigR::mask_Shape(gimms_raster_mvc[[1]], Shape = Shape_shp) # create mask with KrigR-internal function to ensure all edge cells are contained
NDVI_ras <- mask(gimms_raster_mvc, gimms_mask) # mask out shape
NDVI_ras <- calc(NDVI_ras, fun = mean, na.rm = TRUE) # annual mean
writeRaster(NDVI_ras, format = "CDF", file = file.path(Dir.Data, "NDVI")) # save file

So what does this raster look like?

NDVI_ras
## class      : RasterStack 
## dimensions : 92, 108, 9936, 1  (nrow, ncol, ncell, nlayers)
## resolution : 0.08333333, 0.08333333  (x, y)
## extent     : 6, 15, 47.33333, 55  (xmin, xmax, ymin, ymax)
## crs        : +proj=longlat +datum=WGS84 +no_defs 
## names      :     layer 
## min values : 0.2430333 
## max values : 0.8339083

And a visualisation of the same:

Plot_Raw(NDVI_ras, 
         Shp = Shape_shp,
         Dates = "Mean NDVI 2015", 
         COL = viridis(100, begin = 0.5, direction = -1))

As stated above, we want to match this with our output.

KrigR Workflow

We could run this whole analysis in the three steps outlined above, but why bother when the pipeline gets the job done just as well? The pipeline lets you specify the download and downscaling of ERA5(-Land) data with GMTED 2010 DEM covariate data in just one function call.

Matching Kriging outputs with a pre-existing data set is as easy as plugging the pre-existing raster into the Target_res argument of the krigR() or the download_DEM() function.

This time we want to downscale from ERA5 resolution (roughly 30km) because the ERA5-Land data already matches the NDVI resolution (roughly 9km). Here’s how we do this:

NDVI_Krig <- krigR(
  ## download_ERA block
  Variable = '2m_temperature',
  Type = 'reanalysis',
  DataSet = 'era5',
  DateStart = '2015-01-01',
  DateStop = '2015-12-31',
  TResolution = 'year',
  TStep = 1,
  Extent = Shape_shp,
  API_User = API_User,
  API_Key = API_Key,
  SingularDL = TRUE,
  ## download_DEM block
  Target_res = NDVI_ras,
  Source = "Drive",
  ## krigR block
  Cores = 1,
  FileName = "AirTemp_NDVI.nc",
  nmax = 80, 
  Dir = Dir.Exports)
## download_ERA() is starting. Depending on your specifications, this can take a significant time.
## User 39340 for cds service added successfully in keychain
## Staging 1 download(s).
## Staging your request as a singular download now. This can take a long time due to size of required product.
## 0001_2m_temperature_2015-01-01_2015-12-31_year.nc download queried
## Requesting data to the cds service with username 39340
## - staging data transfer at url endpoint or request id:
##   a99a448b-7075-4bf7-9b60-bd52fb0731c1
## - timeout set to 10.0 hours
## - polling server for a data transfer
## 
  |                                                                                      
  |                                                                                |   0%
  |                                                                                      
  |=============================================================                   |  76%
  |                                                                                      
  |================================================================================| 100%
## - moved temporary file to -> C:/Users/erike/Documents/Work/Z - Conferences-Workshops_Presentations/2022_06_15-17 - [NordOikos]/KrigR Workshop/Exports/0001_2m_temperature_2015-01-01_2015-12-31_year.nc
## - request purged from queue!
## Checking for known data issues.
## Loading downloaded data for masking and aggregation.
## Masking according to shape/buffer polygon
## Aggregating to temporal resolution of choice
## 
  |                                                                                      
  |                                                                                |   0%
  |                                                                                      
  |===========================                                                     |  33%
  |                                                                                      
  |=====================================================                           |  67%
  |                                                                                      
  |================================================================================| 100%
## 
## Commencing Kriging
## 
  |                                                                                      
  |                                                                                |   0%
  |                                                                                      
  |================================================================================| 100%

As you can see, executing krigR() as a pipeline updates you on every single step along the way.

So? Did we match the pre-existing data?

NDVI_Krig[[1]]
## class      : RasterBrick 
## dimensions : 92, 108, 9936, 1  (nrow, ncol, ncell, nlayers)
## resolution : 0.08333333, 0.08333333  (x, y)
## extent     : 6, 15, 47.33333, 55  (xmin, xmax, ymin, ymax)
## crs        : +proj=longlat +datum=WGS84 +no_defs 
## source     : memory
## names      : var1.pred 
## min values :  275.9705 
## max values :  285.7357

We nailed this!

Let’s take one final look at our (A) raw ERA5 data, (B) NDVI data, (C) Kriged ERA5 data, and (D) standard error of our Kriging output:

Click here for plotting calls
### ERA-Plot
ERA_df <- as.data.frame(raster(file.path(Dir.Exports, "2m_temperature_2015-01-01_2015-12-31_year.nc")), xy = TRUE) # turn raster into dataframe
colnames(ERA_df)[c(-1,-2)] <- "Air Temperature 2015 (ERA5)"
ERA_df <- gather(data = ERA_df, key = Values, value = "value", colnames(ERA_df)[c(-1,-2)]) #  make ggplot-ready
Raw_plot <- ggplot() + # create a plot
  geom_raster(data = ERA_df , aes(x = x, y = y, fill = value)) + # plot the raw data
  facet_wrap(~Values) + # split raster layers up
  theme_bw() + labs(x = "Longitude", y = "Latitude") + # make plot more readable
  scale_fill_gradientn(name = "Air Temperature [K]", colours = inferno(100)) + # add colour and legend
  geom_polygon(data = Shape_shp, aes(x = long, y = lat, group = group), colour = "black", fill = "NA") # add shape
### NDVI-Plot
NDVI_df <- as.data.frame(NDVI_ras, xy = TRUE) # turn raster into dataframe
colnames(NDVI_df)[c(-1,-2)] <- "NDVI 2015"
NDVI_df <- gather(data = NDVI_df, key = Values, value = "value", colnames(NDVI_df)[c(-1,-2)]) #  make ggplot-ready
NDVI_plot <- ggplot() + # create a plot
  geom_raster(data = NDVI_df , aes(x = x, y = y, fill = value)) + # plot the raw data
  facet_wrap(~Values) + # split raster layers up
  theme_bw() + labs(x = "Longitude", y = "Latitude") + # make plot more readable
  scale_fill_gradientn(name = "NDVI", colours = rev(terrain.colors(100))) + # add colour and legend
  geom_polygon(data = Shape_shp, aes(x = long, y = lat, group = group), colour = "black", fill = "NA") # add shape
### KRIGED-Plots
Dates = c("Kriged Air Temperature 2015 (NDVI Resolution)")
Type_vec <- c("Prediction", "Standard Error") # these are the output types of krigR
Colours_ls <- list(inferno(100), rev(viridis(100))) # we want separate colours for the types
Plots_ls <- as.list(NA, NA) # this list will be filled with the output plots
KrigDF_ls <- as.list(NA, NA) # this list will be filled with the output data
for(Plot in 1:2){ # loop over both output types
  Krig_df <- as.data.frame(NDVI_Krig[[Plot]], xy = TRUE) # turn raster into dataframe
  colnames(Krig_df)[c(-1,-2)] <- paste(Type_vec[Plot], Dates) # set colnames
  Krig_df <- gather(data = Krig_df, key = Values, value = "value", colnames(Krig_df)[c(-1,-2)]) # make ggplot-ready
  Plots_ls[[Plot]] <- ggplot() + # create plot
    geom_raster(data = Krig_df , aes(x = x, y = y, fill = value)) + # plot the kriged data
    facet_wrap(~Values) + # split raster layers up
    theme_bw() + labs(x = "Longitude", y = "Latitude") + # make plot more readable
    scale_fill_gradientn(name = "Air Temperature [K]", colours = Colours_ls[[Plot]]) + # add colour and legend
    theme(plot.margin = unit(c(0, 0, 0, 0), "cm")) + # reduce margins (for fusing of plots)
    geom_polygon(data = Shape_shp, aes(x = long, y = lat, group = group), colour = "black", fill = "NA") # add shape
  KrigDF_ls[[Plot]] <- Krig_df
} # end of type-loop
plot_grid(plotlist = list(Raw_plot, NDVI_plot, Plots_ls[[1]], Plots_ls[[2]]), 
          nrow = 2, labels = "AUTO")

So what can we learn from this? Let’s plot the relation between temperature and NDVI:

plot_df <- as.data.frame(cbind(KrigDF_ls[[1]][,4], 
                               KrigDF_ls[[2]][,4],
                               NDVI_df[,4]))
colnames(plot_df) <- c("Temperature", "Uncertainty", "NDVI")
ggplot(plot_df,
       aes(x = Temperature, y = NDVI, size = Uncertainty)) + 
  geom_point(alpha = 0.15) + 
  theme_bw()

Looks like NDVI increases as mean annual temperatures rise, but reaches a peak around 281-282 Kelvin with a subsequent decrease as mean annual temperatures rise higher.

Using Third-Party Data

ATTENTION: Kriging only works on square-cell spatial products!

The krigR() function is designed to work with non-ERA5(-Land) data as well as non-GMTED2010 covariate data. To downscale your own spatial products using different covariate data than the GMTED2010 DEM we use as a default, you need to step into the three-step workflow.

Most spatial products won’t be reliably downscaled using only elevation covariate data.

krigR() supports any combination of ERA5-family reanalysis, GMTED2010, third-party climate data, and third-party covariate data. Here, we just demonstrate the use of other covariates than the GMTED2010 used by KrigR by default.

The product we will focus on here stems from a data set I have prepared separately. In your Data directory, you will find a NetCDF file called Qsoil_BC.nc (loaded as BCq_ras below). This is a bioclimatic data set like the one we created previously; however, in this data set, water availability has been assessed through soil moisture data. With this data set, we also revert back to our original study region:

The reason we focus on soil moisture for this exercise? In this publication (Figure 3), we demonstrate that soil moisture data can be statistically downscaled using kriging with some third-party covariates.

BCq_ras <- stack(file.path(Dir.Data, "Qsoil_BC.nc"))

Third-Party Data Covariates

In this publication, we demonstrate how soil moisture data can be reliably statistically downscaled using soil property data which we obtain from the Land-Atmosphere Interaction Research Group at Sun Yat-sen University.

Below, you will find the code needed to obtain these data, which come at global coverage and roughly 1km spatial resolution. The code chunk below also crops and masks the data according to our study region and subsequently deletes the storage-heavy global files (3.5GB each in size). This process takes a long time due to download speeds.

Click here for download and cropping of soil moisture kriging covariates:
# documentation of these can be found here http://globalchange.bnu.edu.cn/research/soil4.jsp
SoilCovs_vec <- c("tkdry", "tksat", "csol", "k_s", "lambda", "psi", "theta_s") # need these names for addressing soil covariates
if(!file.exists(file.path(Dir.Covariates, "SoilCovs.nc"))){
  print("#### Loading SOIL PROPERTY covariate data. ####") 
  # create lists to combine soil data into one
  SoilCovs_ls <- as.list(rep(NA, length(SoilCovs_vec)))
  names(SoilCovs_ls) <- c(SoilCovs_vec)
  ## Downloading, unpacking, and loading
  for(Soil_Iter in SoilCovs_vec){
    if(!file.exists(file.path(Dir.Covariates, paste0(Soil_Iter, ".nc")))) { # if not downloaded and processed yet
      print(paste("Handling", Soil_Iter, "data."))
      Dir.Soil <- file.path(Dir.Covariates, Soil_Iter)
      dir.create(Dir.Soil)
      download.file(paste0("http://globalchange.bnu.edu.cn/download/data/worldptf/", Soil_Iter,".zip"),
                    destfile = file.path(Dir.Soil, paste0(Soil_Iter, ".zip"))
      ) # download data
      unzip(file.path(Dir.Soil, paste0(Soil_Iter, ".zip")), exdir = Dir.Soil) # unzip data
      File <- list.files(Dir.Soil, pattern = ".nc")[1] # only keep first soil layer
      Soil_ras <- raster(file.path(Dir.Soil, File)) # load data
      SoilCovs_ls[[which(names(SoilCovs_ls) == Soil_Iter)]] <- Soil_ras # save to list
      writeRaster(x = Soil_ras, filename = file.path(Dir.Covariates, Soil_Iter), format = "CDF")
      unlink(Dir.Soil, recursive = TRUE)
    }else{
      print(paste(Soil_Iter, "already downloaded and processed."))
      SoilCovs_ls[[which(names(SoilCovs_ls) == Soil_Iter)]] <- raster(file.path(Dir.Covariates, paste0(Soil_Iter, ".nc")))
    }
  }
  ## data handling and manipulation
  SoilCovs_stack <- stack(SoilCovs_ls) # stacking raster layers from list
  SoilCovs_stack <- crop(SoilCovs_stack, extent(BCq_ras)) # cropping to extent of data we have
  SoilCovs_mask <- KrigR::mask_Shape(SoilCovs_stack[[1]], Shape = Shape_shp) # create mask with KrigR-internal function to ensure all edge cells are contained
  SoilCovs_stack <- mask(SoilCovs_stack, SoilCovs_mask) # mask out shape
  ## writing the data
  writeRaster(x = SoilCovs_stack, filename = file.path(Dir.Covariates, "SoilCovs"), format = "CDF")
  ## removing the global files due to their size
  unlink(file.path(Dir.Covariates, paste0(SoilCovs_vec, ".nc")))
}

Let’s have a look at these data:

SoilCovs_stack <- stack(file.path(Dir.Covariates, "SoilCovs.nc"))
names(SoilCovs_stack) <- SoilCovs_vec
SoilCovs_stack
## class      : RasterStack 
## dimensions : 408, 648, 264384, 7  (nrow, ncol, ncell, nlayers)
## resolution : 0.008333333, 0.008333333  (x, y)
## extent     : 9.725, 15.125, 49.75, 53.15  (xmin, xmax, ymin, ymax)
## crs        : +proj=longlat +datum=WGS84 +no_defs 
## names      :         tkdry,         tksat,          csol,           k_s,        lambda,           psi,       theta_s 
## min values :  5.200000e-02,  1.337000e+00,  2.141000e+06,  5.212523e+00,  8.600000e-02, -5.307258e+01,  3.230000e-01 
## max values :  2.070000e-01,  2.862000e+00,  2.346400e+06,  2.461686e+02,  3.330000e-01, -5.205317e+00,  5.320000e-01

Now we need to establish training- and target-resolution versions of our covariate data.

First, we focus on the training resolution covariate data. We match our covariate data to our spatial product which we wish to downscale by resampling the covariate data to the coarser resolution:

Coarsecovs <- resample(x = SoilCovs_stack, y = BCq_ras)

Second, we aggregate the covariate data to our desired resolution. In this case, 0.02 as done previously:

Finecovs <- aggregate(SoilCovs_stack, fact = 0.02/res(SoilCovs_stack)[1])

Finally, we combine these into a list like the output of download_DEM():

Covs_ls <- list(Coarsecovs, Finecovs)
Plot_Covs(Covs = Covs_ls, Shape_shp)

Our development goals include creating a function that automatically carries out all of the above for you with a specification similar to download_DEM().

Kriging Third-Party Data

Finally, we can statistically downscale our soil moisture data using the soil property covariates. For this, we need to specify a new KrigingEquation.

With the KrigingEquation argument, you may specify non-linear combinations of covariates for your call to krigR().

If you don’t specify a KrigingEquation in krigR() and your covariates do not contain a layer called "DEM", krigR() will notify you that its default formula cannot be executed and will attempt to build an additive formula from the data it can find. krigR() will inform you of this and ask for your approval before proceeding.

This auto-generated formula would be the same as the one we specify here - an additive combination of all covariates found both at coarse and fine resolutions. Of course, this formula can also be specified to reflect interactive effects.

Here, I automate the generation of our KrigingEquation:

KrigingEquation <- paste0("ERA ~ ", paste(SoilCovs_vec, collapse = " + "))
KrigingEquation
## [1] "ERA ~ tkdry + tksat + csol + k_s + lambda + psi + theta_s"
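The same additive formula can also be built as a proper R formula object with stats::reformulate(), which spares you the paste() bookkeeping (whether krigR() accepts formula objects as well as character strings is not something I verify here, so deparse() back to a string if in doubt):

```r
SoilCovs_vec <- c("tkdry", "tksat", "csol", "k_s", "lambda", "psi", "theta_s")

# build ERA ~ tkdry + tksat + ... without manual string pasting
KrigingFormula <- reformulate(termlabels = SoilCovs_vec, response = "ERA")
paste(deparse(KrigingFormula), collapse = " ")
# "ERA ~ tkdry + tksat + csol + k_s + lambda + psi + theta_s"
```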

I only hand the mean annual soil moisture layer to the kriging call and I set nmax to 80 to approximate a typical weather system in size:

BC_Water_Krig  <- krigR(Data = BCq_ras[[12]], 
                    Covariates_coarse = Covs_ls[[1]], 
                    Covariates_fine = Covs_ls[[2]],
                    KrigingEquation = KrigingEquation, 
                    FileName = "BC_Water_Krig",
                    Dir = Dir.Exports,
                    nmax = 80
)

Interpolating this data is just like statistically downscaling any other soil moisture product and can be done without any problems.

Look at how well the river Elbe shows up in this!

Plot_Krigs(BC_Water_Krig[-3], 
           Shp = Shape_shp,
           Dates = "BIO12 - Annual Mean Soil Moisture",
           columns = 2)

Projections and KrigR

I expect that you will often be interested not just in past and current climatic conditions, but also in future projections of climate data at high spatial resolutions.

The KrigR workflow can be used to establish high-resolution, bias-corrected climate projection products.

This time, we run our exercise for all of Germany because of its size and topographical variety.

Shape_shp <- ne_countries(country = "Germany")

KrigR Process for Projections

We published the KrigR workflow for downscaled climate projections in this publication (Section 3.5) and I will walk you through its contents here.

To achieve downscaled projection products we require three data products:
1. Historical climate data from ERA5(-Land)
2. Historical climate data from projection source
3. Future climate data from projection source

Subsequently, the data products are downscaled to the desired spatial resolution using krigR(). Finally, the difference between the downscaled projection-sourced data products is added to the historical baseline obtained from (downscaled) ERA5(-Land) data. This achieves bias correction.
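In single-cell terms, the bias correction boils down to adding the projection's own anomaly to the observed baseline. A toy sketch with made-up Kelvin values:

```r
# made-up single-cell mean temperatures in Kelvin (illustration only)
ERA_hist  <- 281.2  # downscaled ERA5(-Land) historical baseline
CMIP_hist <- 280.4  # downscaled historical run of the projection source
CMIP_ssp  <- 283.1  # downscaled future run of the projection source

# subtracting CMIP_hist removes the projection model's systematic bias,
# leaving only its internal climate change signal
Delta      <- CMIP_ssp - CMIP_hist  # ~2.7 K of projected warming
Projection <- ERA_hist + Delta      # ~283.9 K bias-corrected future estimate
```

On full rasters, the same two lines apply cell-wise once all three products share resolution and extent.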

Obtaining ERA5(-Land) Data

Now, let’s obtain the historical baseline from ERA5-Land for the same time-period as our CMIP6 historical data.

if(!file.exists(file.path(Dir.Data, "Germany_Hist_ERA.nc"))){
  ## Download monthly 2m air temperature for the historical baseline period
  Hist_ERA_ras <- download_ERA(Variable = "2m_temperature",
                               DateStart = "1981-01-01",
                               DateStop = "1999-12-31",
                               TResolution = "month",
                               TStep = 1,
                               Extent = Shape_shp,
                               Dir = Dir.Data,
                               FileName = "Germany_Hist_ERA", 
                               API_Key = API_Key,
                               API_User = API_User,
                               SingularDL = TRUE)
  ## Collapse the monthly time series into a 12-layer monthly climatology
  Index <- rep(1:12, length = nlayers(Hist_ERA_ras))
  Hist_ERA_ras <- stackApply(Hist_ERA_ras, indices = Index, fun = mean)
  writeRaster(Hist_ERA_ras, filename = file.path(Dir.Data, "Germany_Hist_ERA"), format = "CDF")
}
## Average the monthly climatology into a single mean-temperature layer
Hist_ERA_ras <- mean(stack(file.path(Dir.Data, "Germany_Hist_ERA.nc")))

Obtaining Projection Data

Here, we use CMIP6 projection data manually sourced from the ECMWF CDS distribution.

Our development goals include extending download_ERA() to work with other ECMWF CDS data sets besides ERA5(-Land), including this CMIP6 data set.
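Until that functionality lands, one could script the CMIP6 retrieval with the ecmwfr package (which KrigR itself builds on). The sketch below is an untested assumption: the dataset short name ("projections-cmip6"), the field names, and the model choice mirror the CDS web form at the time of writing and may differ for your CDS account or ecmwfr version; always verify them against the "Show API request" output of the form.

```r
library(ecmwfr)
## HYPOTHETICAL request for monthly CMIP6 near-surface air temperature.
## Dataset name, field names, and model are assumptions taken from the CDS
## 'projections-cmip6' web form; verify before use.
request <- list(
  dataset_short_name  = "projections-cmip6",
  temporal_resolution = "monthly",
  experiment          = "historical",
  variable            = "near_surface_air_temperature",
  model               = "mpi_esm1_2_lr",
  date                = "1981-01-01/2000-12-31",
  format              = "zip",
  target              = "historical_tas_1981-2000.zip"
)
## Requires prior registration of your CDS key via ecmwfr::wf_set_key();
## uncomment to submit the request:
# wf_request(user = as.character(API_User), request = request,
#            transfer = TRUE, path = Dir.Data)
```

Repeat with experiment = "ssp5_8_5" and the 2041-2060 date range for the future run.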

Historical Baseline

## Load CMIP6 historical near-surface air temperature, average across months,
## then crop and mask it to the German borders
train_HIST <- mean(stack(file.path(Dir.Data, "historical_tas_1981-2000.nc")))
train_HIST <- crop(train_HIST, extent(Hist_ERA_ras))
train_mask <- KrigR::mask_Shape(train_HIST, Shape_shp)
train_HIST <- mask(train_HIST, train_mask)

Future Projection

## Load CMIP6 SSP5-8.5 near-surface air temperature for 2041-2060, average,
## then crop and mask it to the German borders
train_SSP <- mean(stack(file.path(Dir.Data, "ssp585_tas_2041-2060.nc")))
train_SSP <- crop(train_SSP, extent(Hist_ERA_ras))
train_mask <- KrigR::mask_Shape(train_SSP, Shape_shp)
train_SSP <- mask(train_SSP, train_mask)

Visualisation of CMIP6 Data

Plot_Raw(stack(train_HIST, train_SSP), 
         Shp = Shape_shp,
         Dates = c("Historic CMIP6", "Future CMIP6"))

Already, we can see that quite a bit of warming is projected to happen all across Germany. However, we want to know about this at higher spatial resolutions. That’s where KrigR comes in.

Establishing Kriged Products

For the first time in this workshop material, we will push our spatial resolution to the finest scale supported by our default GMTED 2010 DEM covariate data: 0.008333° (roughly 1 km).
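As a rough sanity check on that "~1 km" claim (assuming the common approximation of about 111.32 km per degree of latitude):

```r
## Rough conversion of the target resolution from degrees to kilometres.
## 111.32 km per degree of latitude is an approximation, not an exact figure.
Target_res <- 0.008334
cell_km <- Target_res * 111.32
round(cell_km, 2)  # ~0.93 km north-south
## East-west cell widths shrink with the cosine of latitude, so at German
## latitudes (~47-55 degrees N) cells are narrower in that direction.
```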

These operations take quite some time - grab a tea or coffee, go for a walk, or stretch a bit.

The downscaling calls should be familiar by now, so I will forego explaining them. In case the following code snippets do not make sense to you, please consult the portion of this workshop concerned with statistical downscaling.

Historical CMIP6

## Covariate Data
GMTED_DE <- download_DEM(
  Train_ras = train_HIST,
  Target_res = 0.008334,
  Shape = Shape_shp,
  Keep_Temporary = TRUE,
  Dir = Dir.Covariates
)
## Kriging
Output_HIST <- krigR(
  Data = train_HIST,
  Covariates_coarse = GMTED_DE[[1]], 
  Covariates_fine = GMTED_DE[[2]],  
  Keep_Temporary = FALSE,
  Cores = 1,
  Dir = Dir.Exports,  
  FileName = "DE_CMIP-HIST", 
  nmax = 40
)
Plot_Krigs(Output_HIST,
           Shp = Shape_shp,
           Dates = "CMIP6 Historical", columns = 2)

Future CMIP6

## Covariate Data
GMTED_DE <- download_DEM(
  Train_ras = train_SSP,
  Target_res = 0.008334,
  Shape = Shape_shp,
  Keep_Temporary = TRUE,
  Dir = Dir.Covariates
)
## Kriging
Output_SSP <- krigR(
  Data = train_SSP,
  Covariates_coarse = GMTED_DE[[1]], 
  Covariates_fine = GMTED_DE[[2]],   
  Keep_Temporary = FALSE,
  Cores = 1,
  Dir = Dir.Exports,  
  FileName = "DE_SSP585_2041-2060", 
  nmax = 40
)
Plot_Krigs(Output_SSP,
           Shp = Shape_shp,
           Dates = "CMIP6 Future", columns = 2)

Historical ERA5-Land

## Covariate Data
GMTED_DE <- download_DEM(
  Train_ras = Hist_ERA_ras,
  Target_res = 0.008334,
  Shape = Shape_shp,
  Keep_Temporary = TRUE,
  Dir = Dir.Covariates
)
## Kriging
Output_ERA <- krigR(
  Data = Hist_ERA_ras,
  Covariates_coarse = GMTED_DE[[1]], 
  Covariates_fine = GMTED_DE[[2]],   
  Keep_Temporary = FALSE,
  Cores = 1,
  Dir = Dir.Exports,  
  FileName = "DE_hist", 
  nmax = 40
)
Plot_Krigs(Output_ERA,
           Shp = Shape_shp,
           Dates = "ERA5-Land Historical", columns = 2)

Putting It All Together

To establish a final product of high-resolution climate projection data, we simply add the difference between the kriged CMIP6 products to the kriged ERA5-Land product:

## Creating Difference and Projection raster
Difference_ras <- Output_SSP[[1]] - Output_HIST[[1]]
Projection_ras <- Output_ERA[[1]] + Difference_ras
## Force both rasters onto a shared colour scale by writing the joint minimum
## and maximum into two ocean (masked) cells of each raster; cells 10 and 12
## lie outside the German landmass and so do not distort the maps
Output_ERA[[1]][10] <- maxValue(Projection_ras)
Output_ERA[[1]][12] <- minValue(Projection_ras)
Projection_ras[10] <- maxValue(Output_ERA[[1]])
Projection_ras[12] <- minValue(Output_ERA[[1]])
## Individual plots
A_gg <- Plot_Raw(Output_ERA[[1]], Shp = Shape_shp, 
                 Dates = "Historical ERA5-Land (1981-2000)")
B_gg <- Plot_Raw(Difference_ras[[1]], Shp = Shape_shp, 
                 Dates = "Anomalies of SSP585 - Historical CMIP-6",
                 COL = rev(viridis(100)))
C_gg <- Plot_Raw(Projection_ras[[1]], Shp = Shape_shp, 
                 Dates = "Future Projection (ERA5-Land + Anomalies)")
## Fuse the plots into one big plot
ggPlot <- plot_grid(plotlist = list(A_gg, B_gg, C_gg), 
                    ncol = 3, labels = "AUTO") 
ggPlot

Session Info

## R version 4.0.5 (2021-03-31)
## Platform: x86_64-w64-mingw32/x64 (64-bit)
## Running under: Windows 10 x64 (build 19043)
## 
## Matrix products: default
## 
## locale:
## [1] LC_COLLATE=English_United Kingdom.1252  LC_CTYPE=English_United Kingdom.1252   
## [3] LC_MONETARY=English_United Kingdom.1252 LC_NUMERIC=C                           
## [5] LC_TIME=English_United Kingdom.1252    
## 
## attached base packages:
## [1] parallel  stats     graphics  grDevices utils     datasets  methods   base     
## 
## other attached packages:
##  [1] mapview_2.10.2          rnaturalearthdata_0.1.0 rnaturalearth_0.1.0    
##  [4] gimms_1.2.0             ggmap_3.0.0             cowplot_1.1.1          
##  [7] viridis_0.6.0           viridisLite_0.4.0       ggplot2_3.3.6          
## [10] tidyr_1.1.3             KrigR_0.1.2             httr_1.4.2             
## [13] stars_0.5-3             abind_1.4-5             fasterize_1.0.3        
## [16] sf_1.0-0                lubridate_1.7.10        automap_1.0-14         
## [19] doSNOW_1.0.19           snow_0.4-3              doParallel_1.0.16      
## [22] iterators_1.0.13        foreach_1.5.1           rgdal_1.5-23           
## [25] raster_3.4-13           sp_1.4-5                stringr_1.4.0          
## [28] keyring_1.2.0           ecmwfr_1.3.0            ncdf4_1.17             
## 
## loaded via a namespace (and not attached):
##  [1] bitops_1.0-7             satellite_1.0.2          xts_0.12.1              
##  [4] webshot_0.5.2            tools_4.0.5              bslib_0.3.1             
##  [7] utf8_1.2.1               R6_2.5.0                 zyp_0.10-1.1            
## [10] KernSmooth_2.23-18       DBI_1.1.1                colorspace_2.0-0        
## [13] withr_2.4.2              tidyselect_1.1.0         gridExtra_2.3           
## [16] leaflet_2.0.4.1          curl_4.3.2               compiler_4.0.5          
## [19] leafem_0.1.3             gstat_2.0-7              labeling_0.4.2          
## [22] bookdown_0.22            sass_0.4.1               scales_1.1.1            
## [25] classInt_0.4-3           proxy_0.4-25             digest_0.6.27           
## [28] rmarkdown_2.14           base64enc_0.1-3          jpeg_0.1-8.1            
## [31] pkgconfig_2.0.3          htmltools_0.5.2          highr_0.9               
## [34] fastmap_1.1.0            htmlwidgets_1.5.3        rlang_0.4.11            
## [37] FNN_1.1.3                farver_2.1.0             jquerylib_0.1.4         
## [40] generics_0.1.0           zoo_1.8-9                jsonlite_1.7.2          
## [43] crosstalk_1.1.1          dplyr_1.0.5              magrittr_2.0.1          
## [46] Rcpp_1.0.7               munsell_0.5.0            fansi_0.4.2             
## [49] lifecycle_1.0.0          stringi_1.5.3            yaml_2.2.1              
## [52] plyr_1.8.6               grid_4.0.5               crayon_1.4.1            
## [55] lattice_0.20-41          knitr_1.33               pillar_1.6.0            
## [58] boot_1.3-27              rjson_0.2.20             spacetime_1.2-4         
## [61] stats4_4.0.5             codetools_0.2-18         glue_1.4.2              
## [64] evaluate_0.14            vctrs_0.3.7              png_0.1-7               
## [67] rmdformats_1.0.4         RgoogleMaps_1.4.5.3      gtable_0.3.0            
## [70] purrr_0.3.4              reshape_0.8.8            assertthat_0.2.1        
## [73] cachem_1.0.4             xfun_0.31                lwgeom_0.2-6            
## [76] e1071_1.7-6              rnaturalearthhires_0.2.0 class_7.3-18            
## [79] Kendall_2.2              tibble_3.1.1             intervals_0.15.2        
## [82] memoise_2.0.0            units_0.7-2              ellipsis_0.3.2